Abstract. We describe an implementation of a genetic algorithm on
partially commutative groups and apply it to the double coset search 
problem on a subclass of groups. This transforms a combinatorial group 
theory problem to a problem of combinatorial optimisation. We obtain 
a method applicable to a wide range of problems and give results which 
indicate good behaviour of the genetic algorithm, hinting at the presence 
of a new deterministic solution and a framework for further results. 



1 Introduction 

1.1 History and Background 

Genetic algorithms (hereafter referred to as GAs) were introduced by Holland
[3] and have enjoyed a recent renaissance in many applications including
engineering, scheduling and attacking problems such as the travelling salesman and
graph colouring problems. However, the use of GAs in group theory [1,7,8] has
been in operation for a comparatively short time.

This paper discusses an adaptation of GAs for word problems in combinatorial
group theory. We work inside the Vershik groups [11], a subclass of partially
commutative groups (also known as graph groups [10] and trace groups). We
omit a survey of the theory of the groups here and focus on certain applications.

Explicit solutions exist for many problems in this setting. The biautomaticity
of the partially commutative groups is established in [10], so as a
corollary the conjugacy problem is solvable. Wrathall [13] gave a fast algorithm
for the word problem based upon restricting the problem to a monoid generated
by group generators and their formal inverses. In [13], an algorithm is given for
the conjugacy problem; it runs in linear time in a stack-based computation model.

Our work is an experimental investigation of GAs in this setting to determine
why they seem to work in certain areas of combinatorial group theory and to
determine bounds for what happens for given problems. This is done by translating
given word problems to ones of combinatorial optimisation.

1.2 Partially Commutative Groups and Vershik Groups 

Let $X = \{x_1, x_2, \ldots, x_n\}$ be a finite set and define the operation of multiplication
of $x_i, x_j \in X$ to be the juxtaposition $x_i x_j$. As in [13], we specify a partially
commutative group $G(X)$ by $X$ and the collection of all elements from $X$ that
commute; that is, the set of all pairs $(x_i, x_j)$ such that $x_i, x_j \in X$ and $x_i x_j = x_j x_i$.
For example, take $X = \{x_1, x_2, x_3, x_4\}$ and suppose that $x_1 x_4 = x_4 x_1$ and
$x_2 x_3 = x_3 x_2$. Then we denote this group $G(X) = \langle X : [x_1, x_4], [x_2, x_3] \rangle$.

The elements of $X$ are called generators for $G(X)$. Note that for general
$G(X)$ some generators commute and some do not, and there are no other non-trivial
relations between the generators. We concentrate on Vershik groups, a
particular subclass of the above groups. For a set $X$ with $n$ elements as above,
the Vershik group of rank $n$ over $X$ is given by

$$V_n = \langle X : [x_i, x_j] \text{ if } |i - j| \ge 2 \rangle.$$

For example, in the group $V_4$ the pairs of elements that commute with each other
are $(x_1, x_3)$, $(x_1, x_4)$ and $(x_2, x_4)$. We may also write this as $V(X)$ assuming an
arbitrary set $X$. The elements of $V_n$ are represented by group words written as
products of generators. The length, $l(u)$, of a word $u \in V_n$ is the minimal number
of single generators from which $u$ can be written. For example $u = x_1 x_2 x_4 \in V_4$
is a word of length three. We use $x_i^{\mu}$ to denote $\mu$ successive multiplications of
the generator $x_i$; for example, $x_2^4 = x_2 x_2 x_2 x_2$. Denote the empty word $e \in V_n$.

For a subset, Y, of the set X we say the Vershik group V(Y) is a parabolic 
subgroup of V(X). It is easily observed that any partially commutative group G 
may be realised as a subgroup of a Vershik group V n of sufficiently large rank n. 

Vershik [11] solved the word problem in $V_n$ by means of reducing words to
their normal form. The Knuth-Bendix normal form of a word $u \in V_n$ of length
$l(u)$ may be thought of as the "shortest form" of $u$ and is given by the unique
expression

$$\bar{u} = x_{i_1}^{\mu_1} x_{i_2}^{\mu_2} \cdots x_{i_k}^{\mu_k}$$

such that all $\mu_j \neq 0$, $l(u) = \sum_{j=1}^{k} |\mu_j|$ and

i) if $i_j = 1$ then $i_{j+1} > 1$;

ii) if $i_j = m < n$ then $i_{j+1} = m - 1$ or $i_{j+1} > m$;

iii) if $i_j = n$ then $i_{j+1} = n - 1$.

The name of the above form follows from the Knuth-Bendix algorithm with
ordering $x_1 < x_1^{-1} < x_2 < x_2^{-1} < \cdots < x_n < x_n^{-1}$. We omit further discussion of
this here; the interested reader is referred to [5] for a description of the algorithm.

The algorithm to produce the above normal form is essentially a restriction
of the stack-based (or heap-based) algorithm of [12] to the Vershik group, and
we thus conjecture that the normal form of a word $u \in V_n$ may be computed
efficiently in time $O(l(u) \log l(u))$ for the "average case". From now on we write
$\bar{u}$ to mean the normal form of the word $u$. For a word $u \in V_n$, we say that

$$RF(u) = \{x_i^{\alpha} : l(u x_i^{-\alpha}) = l(u) - 1,\ \alpha = \pm 1\}$$

is the roof of $u$ and

$$FL(u) = \{x_i^{\alpha} : l(x_i^{-\alpha} u) = l(u) - 1,\ \alpha = \pm 1\}$$

is the floor of $u$. The roof (and floor) of $u$ correspond to the generators which
may be cancelled after their inverses are juxtaposed to the right (and left) end
of $u$ to create the word $u'$ and $u'$ is reduced to its normal form $\bar{u'}$. For example,
if u = X2XqX^ 1 X4X1 then RF(u) = {x%, £4} and FL(u) — {x^ 1 , x§). 
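
The roof and floor can be computed directly from their definitions by brute force. The following Python sketch is illustrative only. It assumes a word is stored as a list of signed integers (+i for $x_i$, -i for $x_i^{-1}$), and the names `commutes`, `reduce_word`, `roof` and `floor` are hypothetical. The reduction is a simple quadratic cancellation scan, not the stack-based algorithm cited above.

```python
def commutes(g, h):
    """Vershik relation: x_i and x_j commute exactly when |i - j| >= 2."""
    return abs(abs(g) - abs(h)) >= 2

def reduce_word(word):
    """Cancel every pair x_i, x_i^{-1} that commutation can bring together."""
    out = []
    for g in word:
        for k in range(len(out) - 1, -1, -1):
            if out[k] == -g:             # inverse reachable: cancel the pair
                del out[k]
                break
            if not commutes(out[k], g):  # blocked by a non-commuting letter
                out.append(g)
                break
        else:                            # g commutes past everything so far
            out.append(g)
    return out

def roof(u, n):
    """Letters g with l(u g^{-1}) = l(u) - 1, for a word u in V_n."""
    base = len(reduce_word(u))
    return {g for i in range(1, n + 1) for g in (i, -i)
            if len(reduce_word(list(u) + [-g])) == base - 1}

def floor(u, n):
    """Letters g with l(g^{-1} u) = l(u) - 1, for a word u in V_n."""
    base = len(reduce_word(u))
    return {g for i in range(1, n + 1) for g in (i, -i)
            if len(reduce_word([-g] + list(u))) == base - 1}

print(roof([1, 2], 4), floor([1, 2], 4))  # {2} {1}
```

For $u = x_1 x_2$ only $x_2$ can be cancelled on the right and only $x_1$ on the left, because $x_1$ and $x_2$ do not commute.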

2 Statement of Problem 

Given a Vershik group $V_n$ and two words $a, b$ in the group, we wish to determine
whether $a$ and $b$ lie in the same double coset with respect to given subgroups.
In other words, consider the following problem:

The Double Coset Search Problem (DCSP) Given two parabolic subgroups
$V(Y)$ and $V(Z)$ of a Vershik group $V_n$ and two words $a, b \in V_n$ such that
$b \in V(Y)\, a\, V(Z)$, find words $x \in V(Y)$ and $y \in V(Z)$ such that $b = xay$.

We attack this group-theoretic problem by transforming it into one of combinatorial
optimisation. In the following exposition, an instance of the DCSP is
specified by a pair $(a, b)$ of given words, each in $V_n$, and the notation $\mathcal{M}((a, b))$
denotes the set of all feasible solutions to the given instance. We will use a GA
to iteratively produce "approximations" to solutions to the DCSP, and denote
an "approximation" for a solution $(x, y) \in \mathcal{M}((a, b))$ by $(\chi, \zeta) \in V(Y) \times V(Z)$.

Combinatorial Optimisation DCSP

Input: Two words $a, b \in V_n$.

Constraints: $\mathcal{M}((a, b)) = \{(\chi, \zeta) \in V(Y) \times V(Z) : \chi a \zeta = b\}$.
Costs: The function $C((\chi, \zeta)) = l(\chi a \zeta b^{-1}) \ge 0$.
Goal: Minimise $C$.

The cost of the pair $(\chi, \zeta)$ is a non-negative integer imposed by the above
function $C$. The length function defined on $V_n$ takes non-negative values; hence
an optimal solution for the instance is a pair $(\chi, \zeta)$ such that $C((\chi, \zeta)) = 0$.
Therefore our goal is to minimise the cost function $C$.

As an application of our work, note that the Vershik groups are inherently 
related to the braid groups, a rich source of primitives for algebraic cryptography. 
In particular, the DCSP in the Vershik groups is an analogue of an established 
braid group primitive. The reader is invited to consult jj>j for further details. 

In the next section we expand these notions and detail the method we use 
to solve this optimisation problem. 

3 Genetic Algorithms on Vershik Groups 
3.1 An Introduction to the Approach 

For brevity we do not discuss the elementary concepts of GAs here, but refer the
reader to [4,9] for a discussion of GAs and remark that we use standard terms
such as cost-proportionate selection and reproductive method in a similar way.

We give a brief introduction to our approach. We begin with an initial population
of "randomly generated" pairs of words, each pair of which is treated as an
approximation to a solution $(x, y) \in \mathcal{M}((a, b))$ of an instance $(a, b)$ of the DCSP.
We explicitly note that the GA does not know either of the words x or y. Each 
pair of words in the population is ranked according to some cost function which 
measures how "closely" the given pair of words approximates (x,y). After that 
we systematically imitate natural selection and breeding methods to produce a 
new population, consisting of modified pairs of words from our initial population. 
Each pair of words in this new population is then ranked as before. We continue 
to iterate populations in this way to gather steadily closer approximations to a 
solution (x,y) until we arrive at a solution (or otherwise). 

3.2 The Representation and Computation of Words 

We work in $V_n$ and two given parabolic subgroups $V(Y)$ and $V(Z)$, and wish
the GA to find an exact solution to a posed problem. We naturally represent a
group word $u = x_{i_1}^{\epsilon_1} x_{i_2}^{\epsilon_2} \cdots x_{i_k}^{\epsilon_k}$ of arbitrary length by a string of integers, where
we consecutively map each generator of the word $u$ as follows:

$$x_i^{\epsilon_i} \mapsto \begin{cases} +i & \text{if } \epsilon_i = +1 \\ -i & \text{if } \epsilon_i = -1 \end{cases}$$

For example, if $u = x_1^{-1} x_4 x_2 x_3^2 x_7 \in V_7$ then $u$ is represented by the string
-1 4 2 3 3 7. In this context the length of $u$ is equal to the number of integers
in its string representation. We define a chromosome to be the GA representation
of a pair $(\chi, \zeta)$ of words, and note that each word is naturally of variable length.
Moreover a population is a multiset of a fixed number p of chromosomes. The GA 
has two populations in memory, the current population and the next generation. 
As with traditional GAs the current population contains the chromosomes under 
consideration at the current iteration of the GA, and the next generation has 
chromosomes deposited into it by the GA which form the current population on 
the next iteration. A subpopulation is a submultiset of a given population. 
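
The integer-string representation can be sketched in a few lines of Python. This is an illustration under our own assumptions rather than the original implementation (which was written in C++): the function names `encode` and `decode` and the input format, a list of (index, exponent) pairs, are hypothetical.

```python
def encode(factors):
    """Map [(i, mu), ...], read as the product of x_i^mu, to a signed-integer string."""
    word = []
    for index, exponent in factors:
        if index < 1 or exponent == 0:
            raise ValueError("need index >= 1 and a non-zero exponent")
        letter = index if exponent > 0 else -index
        word.extend([letter] * abs(exponent))  # one integer per single generator
    return word

def decode(word):
    """Render a signed-integer string back as a readable product of generators."""
    return " ".join(f"x{g}" if g > 0 else f"x{-g}^-1" for g in word)

# The example word from the text: x_1^{-1} x_4 x_2 x_3^2 x_7.
u = encode([(1, -1), (4, 1), (2, 1), (3, 2), (7, 1)])
print(u)          # [-1, 4, 2, 3, 3, 7]
print(decode(u))  # x1^-1 x4 x2 x3 x3 x7

# A chromosome is then simply a pair of such variable-length lists.
chromosome = (encode([(3, 1), (2, -1)]), encode([(7, -1), (10, 1)]))
```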

We use the natural representation for ease of algebraic operation, acknowledging
that faster or more sophisticated data structures exist, for example the
stack-based data structure of [13]. However we believe the simplicity of our representation
yields relatively uncomplicated reproductive algorithms. In contrast,
we believe a stack-based data structure yields reproductive methods of considerable
complexity. We give our reproductive methods in the next subsection.

Besides normal form reduction of a word $u$ we use pseudo-reduction of $u$. Let
$\{x_{i_{j_1}}, x_{i_{j_1}}^{-1}, \ldots, x_{i_{j_m}}, x_{i_{j_m}}^{-1}\}$ be the generators which would be removed from $u$ if
we were to reduce $u$ to normal form. Pseudo-reduction of $u$ is defined as simply
removing the above generators from u. There is no reordering of the resulting 
word (as with normal form). For example, if u = xqXsX^ 1 X2X^ x^Xqx^x^ then 
its pseudo-normal form is u = xqx^ 1 xqX4X^ and the normal form of u is u = 
x\ x X4x\x^. Clearly, the pseudo-normal form and the normal form of $u$ have the same
length. The pseudo-normal form is efficiently computable, with complexity at most that of
the algorithm used to compute the normal form $\bar{u}$. Note that a word is not assumed to
be in any given form unless otherwise stated.
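
Pseudo-reduction can be sketched as follows. This is a minimal Python illustration, assuming the signed-integer representation above and the Vershik commutation rule $|i - j| \ge 2$; the names `commutes` and `pseudo_reduce` are hypothetical, and the quadratic scan is chosen for clarity, not speed.

```python
def commutes(g, h):
    """Vershik relation: x_i and x_j commute exactly when |i - j| >= 2."""
    return abs(abs(g) - abs(h)) >= 2

def pseudo_reduce(word):
    """Delete all cancelling pairs but keep the survivors in their original order."""
    out = []
    for g in word:
        for k in range(len(out) - 1, -1, -1):
            if out[k] == -g:             # g meets its inverse: remove both
                del out[k]
                break
            if not commutes(out[k], g):  # g cannot move further left
                out.append(g)
                break
        else:
            out.append(g)
    return out

print(pseudo_reduce([1, 3, -1]))     # [3]: x_1 commutes past x_3 and cancels
print(pseudo_reduce([1, 2, -1]))     # [1, 2, -1]: x_2 blocks the cancellation
print(pseudo_reduce([2, 4, -2, 1]))  # [4, 1]: survivors are not reordered
```

In the last call a normal form computation would additionally reorder the commuting survivors; pseudo-reduction leaves them as they are.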
3.3 Reproduction

The following reproduction methods are adaptations of standard GA reproduction
methods. The methods act on a subpopulation to give a child chromosome,
which we insert into the next population (more details are given in Section 5).

1. Sexual (crossover): by some selection function, input two parent chromosomes
$c_1$ and $c_2$ from the current population. Choose one random segment
from $c_1$, one from $c_2$ and output the concatenation of the segments.

2. Asexual: input a parent chromosome c, given by a selection function, from the 
current population. Output one child chromosome by one of the following: 

(a) Insertion of a random generator into a random position of c. 

(b) Deletion of a generator at a random position of c. 

(c) Substitution of a generator located at a random position in c with a 
random generator. 

3. Continuance: return several chromosomes $c_1, c_2, \ldots, c_m$ chosen by some selection
algorithm, such that the first one returned is the "fittest" chromosome
(see the next subsection). This method is known as partially elitist.

4. Non-Local Admission: return a random chromosome by some algorithm. 

With the exception of continuance, the methods are repeated for each child 
chromosome required. 
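
The crossover and asexual methods above can be sketched on single words as follows. This Python fragment is an assumption-laden illustration: it uses purely random positions and generators (no traceback guidance), acts on one component word of a chromosome at a time, and all function names are hypothetical.

```python
import random

def random_generator(alphabet, rng):
    """A random letter x_i or x_i^{-1} with i drawn from the allowed indices."""
    return rng.choice(alphabet) * rng.choice((1, -1))

def insertion(word, alphabet, rng):
    pos = rng.randint(0, len(word))  # any gap, including both ends
    return word[:pos] + [random_generator(alphabet, rng)] + word[pos:]

def deletion(word, rng):
    if not word:                     # nothing to delete from the empty word
        return list(word)
    pos = rng.randrange(len(word))
    return word[:pos] + word[pos + 1:]

def substitution(word, alphabet, rng):
    if not word:
        return list(word)
    pos = rng.randrange(len(word))
    return word[:pos] + [random_generator(alphabet, rng)] + word[pos + 1:]

def random_segment(word, rng):
    i, j = sorted((rng.randint(0, len(word)), rng.randint(0, len(word))))
    return word[i:j]                 # contiguous, possibly empty

def crossover(parent1, parent2, rng):
    """Concatenate one random segment from each parent."""
    return random_segment(parent1, rng) + random_segment(parent2, rng)

rng = random.Random(0)
Y = [1, 2, 3, 4, 5, 6, 7]
child = crossover([3, -2, 5, 7], [1, 1, -4], rng)
```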

3.4 The Cost Function 

In a sense, a cost function induces a partial metric over the search space to
give a measure of the "distance" of a chromosome from a solution. Denote the
solution of an instance of the DCSP in Section 2 by $(x, y)$ and a chromosome by
$(\chi, \zeta)$. Let $E(\chi, \zeta) = \chi a \zeta b^{-1}$; for simplicity we denote this expression by $E$. The
normal form of the above expression is denoted $\bar{E}$. When $(\chi, \zeta)$ is a solution to
an instance, we have $\bar{E} = e$ (the empty word) with defined length $l(\bar{E}) = 0$.

The cost function we use is as follows: given a chromosome $(\chi, \zeta)$ its cost
is given by the formula $C((\chi, \zeta)) = l(\bar{E})$. This value is computed for every
chromosome in the current population at each iteration of the GA. This means
we seek to minimise the value of $C((\chi, \zeta))$ as we iterate the GA.
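
A minimal Python sketch of this cost function follows, assuming words are lists of signed integers and using a simple cancellation scan in place of a true normal form algorithm (only the length is needed here). The names `commutes`, `reduce_word`, `inverse` and `cost` are hypothetical.

```python
def commutes(g, h):
    """Vershik relation: x_i and x_j commute exactly when |i - j| >= 2."""
    return abs(abs(g) - abs(h)) >= 2

def reduce_word(word):
    """Remove all cancelling pairs; the result has the length of the normal form."""
    out = []
    for g in word:
        for k in range(len(out) - 1, -1, -1):
            if out[k] == -g:
                del out[k]
                break
            if not commutes(out[k], g):
                out.append(g)
                break
        else:
            out.append(g)
    return out

def inverse(word):
    """Formal inverse: reverse the word and invert each letter."""
    return [-g for g in reversed(word)]

def cost(chi, zeta, a, b):
    """C((chi, zeta)) = l(chi a zeta b^{-1})."""
    return len(reduce_word(chi + a + zeta + inverse(b)))

a = [2, 5]
b = [1, 2, 5, 7]                 # b = x a y with x = x_1 and y = x_7
print(cost([1], [7], a, b))      # 0: the pair (x_1, x_7) solves the instance
print(cost([], [], a, b))        # 2: the empty chromosome is two letters away
```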

3.5 Selection Algorithms 

We realise continuance by roulette wheel selection. This is cost proportionate.
As we will see in Algorithm 2, we implicitly require the population to be ordered
best cost first. To this end, write the population as a list $\{(\chi_1, \zeta_1), \ldots, (\chi_p, \zeta_p)\}$
where $C((\chi_1, \zeta_1)) \le C((\chi_2, \zeta_2)) \le \cdots \le C((\chi_p, \zeta_p))$. Then the algorithm is as follows:

Algorithm 1 (Roulette Wheel Selection)

Input: The population size $p$; the population chromosomes $(\chi_i, \zeta_i)$; their costs
$C((\chi_i, \zeta_i))$; and $n_s$, the number of chromosomes to select

Output: $n_s$ chromosomes from the population

1. Let $W \leftarrow \sum_{i=1}^{p} C((\chi_i, \zeta_i))$;

2. Compute the sequence $\{p_s\}$ such that $p_s((\chi_i, \zeta_i)) \leftarrow C((\chi_i, \zeta_i)) / W$;

3. Reverse the sequence $\{p_s\}$;

4. For $j = 1, \ldots, p$, compute $q_j \leftarrow \sum_{i=1}^{j} p_s((\chi_i, \zeta_i))$;

5. For $t = 1, \ldots, n_s$, do

(a) If $t = 1$ output $(\chi_1, \zeta_1)$, the chromosome with least cost. End.

(b) Else

i. Choose a random $r \in [0, 1]$;
ii. Output $(\chi_k, \zeta_k)$ such that $q_{k-1} < r < q_k$. End.

The algorithm respects the requirement that chromosomes with least cost 
are selected more often. For crossover we use tournament selection, where we 
input three randomly chosen chromosomes in the current population and select 
the two with least cost. If all three have identical cost, then select the first two 
chosen. Selection of chromosomes for asexual reproduction is at random from 
the current population. 
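
Both selection schemes can be sketched briefly in Python. The sketch below is an interpretation under stated assumptions: chromosomes are opaque objects with externally supplied costs, the boundary case $r = q_k$ is assigned to index $k$, a zero total cost falls back to uniform shares, and the tournament draws its three candidates with replacement. The function names are hypothetical.

```python
import bisect
import random
from itertools import accumulate

def roulette_select(population, costs, n_s, rng):
    """Roulette wheel selection on a non-empty population, least cost favoured."""
    order = sorted(range(len(population)), key=lambda i: costs[i])
    ranked = [population[i] for i in order]        # least cost first
    ranked_costs = [costs[i] for i in order]
    total = sum(ranked_costs)
    if total == 0:                                 # assumed fallback, not in the text
        shares = [1 / len(ranked)] * len(ranked)
    else:
        shares = [c / total for c in ranked_costs]
    shares.reverse()                               # best chromosome gets the largest share
    q = list(accumulate(shares))                   # cumulative sums q_1, ..., q_p
    chosen = [ranked[0]] if n_s > 0 else []        # first output is always the fittest
    while len(chosen) < n_s:
        r = rng.random()
        k = min(bisect.bisect_left(q, r), len(ranked) - 1)
        chosen.append(ranked[k])
    return chosen

def tournament_select(population, costs, rng):
    """Draw three chromosomes at random and keep the two of least cost."""
    picks = [rng.randrange(len(population)) for _ in range(3)]
    best_two = sorted(picks, key=lambda i: costs[i])[:2]  # stable: ties keep draw order
    return [population[i] for i in best_two]

rng = random.Random(0)
population = ["c1", "c2", "c3", "c4"]
costs = [4, 1, 3, 2]
selected = roulette_select(population, costs, 3, rng)
parents = tournament_select(population, costs, rng)
```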



4 Traceback 

In many ways, cost functions are a large part of a GA. But the reproduction
methods often specify that a random generator is chosen, so reducing the number
of possible choices of generator may serve to guide the GA. We give a possible
approach to reducing this number and term it traceback. In brief, we take the
problem instance given by the pair $(a, b)$ and use $a$ and $b$ to determine properties
of a feasible solution $(x, y) \in \mathcal{M}((a, b))$ to the instance. This approach exploits
the "geometry" of the search space by tracking the process of reduction of $E$ to
its normal form in $V_n$ and proceeds as follows:

Recall $Y$ and $Z$ respectively denote the set of generators of the parabolic
subgroups $G(Y)$ and $G(Z)$. Suppose we have a chromosome $(\chi, \zeta)$ at some stage
of the GA computation. Form the expression $E = \chi a \zeta b^{-1}$ associated to the given
instance of the DCSP and label each generator from $\chi$ and $\zeta$ with its position in
the product $\chi\zeta$. Then reduce $E$ to its normal form $\bar{E}$; during reduction the labels
travel with their associated generators. As a result some generators from $\chi$ or $\zeta$
may be cancelled or not, and the set of labels of the non-cancelled generators of
$\chi$ and $\zeta$ give the original positions.
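
The labelling step can be illustrated with a short Python sketch. It is a simplification under our own assumptions: it uses an order-preserving cancellation scan rather than a true normal form reduction, it does not form blocks or choose a recommended generator, and when several cancellations are possible it simply takes the nearest candidate. The names `reduce_labelled` and `surviving_labels` are hypothetical.

```python
def commutes(g, h):
    """Vershik relation: x_i and x_j commute exactly when |i - j| >= 2."""
    return abs(abs(g) - abs(h)) >= 2

def reduce_labelled(letters):
    """Cancel pairs in a list of (generator, label) items; labels travel along."""
    out = []
    for g, label in letters:
        for k in range(len(out) - 1, -1, -1):
            if out[k][0] == -g:
                del out[k]
                break
            if not commutes(out[k][0], g):
                out.append((g, label))
                break
        else:
            out.append((g, label))
    return out

def surviving_labels(chi, zeta, a, b):
    """Return the cost and the labels of chromosome letters left uncancelled."""
    letters = [(g, pos) for pos, g in enumerate(chi, start=1)]
    letters += [(g, None) for g in a]                        # letters of a carry no label
    letters += [(g, len(chi) + pos) for pos, g in enumerate(zeta, start=1)]
    letters += [(-g, None) for g in reversed(b)]             # b^{-1}
    reduced = reduce_labelled(letters)
    return len(reduced), [label for _, label in reduced if label is not None]

a = [2, 5]
b = [1, 2, 5, 7]
print(surviving_labels([1, 3], [7], a, b))  # (1, [2]): only position 2 survives
```

Here the chromosome $(x_1 x_3, x_7)$ has cost one, and the surviving label points at the superfluous $x_3$ in position 2, which is the kind of information traceback uses to guide deletion.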

The generators in $V_n$ which commute mean that the chromosome may be split
into blocks $\{\beta_i\}$. Each block is formed from at least one consecutive generator
of $\chi$ and $\zeta$ which move together under reduction of $E$. Let $B$ be the set of all
blocks from the above process. Now a block $\beta_m \in B$ and a position $q$ (which
we call the recommended position) at either the left or right end of that block
are randomly chosen. Depending upon the position chosen, take the subword $\delta$
between either the current and next block $\beta_{m+1}$ or the current and prior block
$\beta_{m-1}$ (if available). If there is just one block, then take $\delta$ to be between $\beta_1$ and
the end or beginning of $E$.

Then identify the word $\chi$ or $\zeta$ from which the position $q$ originated and its
associated generating set $S = Y$ or $S = Z$. The position $q$ is at either the left
or right end of the chosen block. So depending on the end of the block chosen,
randomly select the inverse of a generator from $RF(\delta) \cap S$ or $FL(\delta) \cap S$. Call
this the recommended generator $g$. Note if both $\chi$ and $\zeta$ are entirely cancelled
(and so $B$ is empty), we return a random recommended generator and position.

With these, the insertion algorithm inserts the inverse of the generator on
the appropriate side of the recommended position in $\chi$ or $\zeta$. In the cases of
substitution and deletion, we substitute the recommended generator or delete
the generator at the recommended position. We now give an example for the
DCSP on $V_{10}$ with the two parabolic subgroups of $V(Y) = V_7$ and $V(Z) = V_{10}$.

Example of Traceback on a Given Instance Take the short DCSP instance 

(a, b) = (x^X^X^X^X^X-jX^XqXiq, X2X4X5X^ 1 X3XrXQ 1 XioXg) 

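and let the current chromosome be $(\chi, \zeta) = (x_3 x_2^{-1} x_5 x_7,\ x_5 x_2 x_3 x_7^{-1} x_{10})$.
Represent the labels of the positions of the generators in $\chi$ and $\zeta$ by the following
numbers immediately above each generator:

$$\chi = \overset{1}{x_3}\, \overset{2}{x_2^{-1}}\, \overset{3}{x_5}\, \overset{4}{x_7}, \qquad \zeta = \overset{5}{x_5}\, \overset{6}{x_2}\, \overset{7}{x_3}\, \overset{8}{x_7^{-1}}\, \overset{9}{x_{10}}$$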



Forming $E$ and reducing it to its Knuth-Bendix normal form gives
12 3 4 

X 3 X^ 1 X^ 1 X 2 X2 X 3 X? 1 X5 X4 X5 X4 1 X-j X7 

5 8 " 9 

Xq X§ X^. X*j Xq X ^ X ^ Xfj Xq -3^10 X\q Xq *^10 

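which contains eight remaining generators from $(\chi, \zeta)$. Take the cost to be $C((\chi, \zeta)) =
l(\bar{E}) = 26$, the number of generators in $\bar{E}$ above. There are three blocks for $\chi$:

$$\beta_1 = \overset{1}{x_3}\, \overset{2}{x_2^{-1}}, \qquad \beta_2 = \overset{3}{x_5}, \qquad \beta_3 = \overset{4}{x_7}$$

and three for $\zeta$:

$$\beta_4 = \overset{5}{x_5}, \qquad \beta_5 = \overset{8}{x_7^{-1}}, \qquad \beta_6 = \overset{9}{x_{10}}$$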



Suppose we choose position eight, which is in $\zeta$ and is block $\beta_5$. This is a block
of length one; we may take the word to the left or the right as our choice for $\delta$.

Suppose we choose the word to the right, so $\delta$ = x^x^ 1 x^ 1 x^ 1 x 9 x w and in
this case, $S = \{x_1, \ldots, x_{10}\}$. So we choose a random generator from $FL(\delta) \cap S$ =
{xq, xg}. Choose g = x^ 1 and so $\zeta$ becomes $\zeta'$ = X5X2X3X7 x§ x\o, with $\chi' = \chi$.
The cost becomes $C((\chi', \zeta')) = l(\chi' a \zeta' b^{-1}) = 25$. Note that we could have taken
any block and the permitted directions to create $\delta$. In this case, there are eleven
choices of $\delta$, clearly considerably fewer than the total number of subwords of
$E$. Traceback provides a significant increase in performance over merely random
selection (this is easily calculated in the above example to be by a factor of 38).
5 Setup of the Genetic Algorithm
5.1 Specification of Output Alphabet 

Let $n = 2m$ for some integer $m > 1$. Define the subsets of generators
$Y = \{x_1, \ldots, x_{m-1}\}$, $Z = \{x_{m+2}, \ldots, x_n\}$ and two corresponding parabolic subgroups
$G(Y) = \langle Y \rangle$, $G(Z) = \langle Z \rangle$. Clearly $G(Y)$ and $G(Z)$ commute as groups, since every
index in $Y$ differs from every index in $Z$ by at least three: if we
take any $m > 1$ and any words $x_y \in G(Y)$, $x_z \in G(Z)$ then $x_y x_z = x_z x_y$. We
direct the interested reader to [5] for information on the importance of the preceding
statement. Given an instance $(a, b)$ of the DCSP with parabolic subgroups
as above, we will seek a representative for each of the two words $x \in G(Y)$ and
$y \in G(Z)$ that are a solution to the DCSP. Let us label this problem (P).



5.2 The Algorithm and its Parameters 

Given a chromosome $(\chi, \zeta)$ we choose crossover to act on either $\chi$ or $\zeta$ at random,
and fix the other component of the chromosome. Insertion is performed according
to the position in $\chi$ or $\zeta$ given by traceback, and substitution is with a random
generator, both such that if the generator chosen cancels with a neighbouring
generator from the word then another random generator is chosen. We choose to
use pseudo-normal form for all chromosomes to remove all redundant generators
while preserving the internal ordering of $(\chi, \zeta)$.

By experiment, GA behaviour and performance is mostly controlled by the
parameter set chosen. A parameter set is specified by the population size $p$ and
the numbers of children produced by each reproduction algorithm. The collection of
numbers of children is given by a multiset of non-negative integers $P = \{p_i\}$,
where $\sum p_i = p$ and each $p_i$ is given, in order, by the number of crossovers,
substitutions, deletions, insertions, selections and random chromosomes. The
GA is summarised by the following algorithm:

Algorithm 2 (GA for DCSP)

Input: The parameter set, words $a, b$ and their lengths $l(a), l(b)$, suicide control $\sigma$,
initial length $L_1$

Output: A solution $(\chi, \zeta)$ or timeout; $i$, the number of populations

1. Generate the initial population $P_0$, consisting of $p$ random (unreduced) chromosomes
$(\chi, \zeta)$ of initial length $L_1$;

2. $i \leftarrow 0$;

3. Reduce every chromosome in the population to its pseudo-normal form.

4. While $i < \sigma$ do

(a) For $j = 1, \ldots, p$ do

i. Reduce each pair $(\chi_j, \zeta_j) \in P_i$ to its pseudo-normal form $(\chi_j, \zeta_j)$;

ii. Form the expression $E = \chi_j a \zeta_j b^{-1}$;

iii. Perform the traceback algorithm to give $C((\chi_j, \zeta_j))$, recommended
generator $g$ and recommended position $q$;

(b) Sort current population $P_i$ into least-cost-first order and label the chromosomes
$(\chi_1, \zeta_1), \ldots, (\chi_p, \zeta_p)$;

(c) If the cost of $(\chi_1, \zeta_1)$ is zero then return solution $(\chi_1, \zeta_1)$ and END.

(d) $P_{i+1} \leftarrow \emptyset$;

(e) For $j = 1, \ldots, p$ do

i. Using the data obtained in step 4(a)(iii), perform the appropriate
reproductive algorithm on $(\chi_j, \zeta_j)$ and denote the result $(\chi'_j, \zeta'_j)$;

ii. $P_{i+1} \leftarrow P_{i+1} \cup \{(\chi'_j, \zeta'_j)\}$;

(f) $i \leftarrow i + 1$.

5. Return failure. END.

The positive integer $\sigma$ is an example of a suicide control, where the GA stops
(suicide) if more than $\sigma$ populations have been generated. In all cases here, $\sigma$ is
chosen by experimentation; GA runs that continued beyond $\sigma$ populations were
unlikely to produce a successful conclusion. By deterministic search we found a
population size of $p = 200$ and parameter set $P = \{5, 33, 4, 128, 30, 0\}$ for which
the GA performs well when $n = 10$. We observed that the GA exhibits the
well-known common characteristic of sensitivity to changes in parameter set; we
leave this to future work. We found an optimal length of one for each word in
our initial population, and now devote the remainder of the paper to our results
of testing the GA and analysis of the data collected.
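
The overall loop of Algorithm 2 can be sketched in Python as below. This is a deliberately stripped-down illustration and not the algorithm as tested in this paper: it omits traceback, crossover, roulette wheel selection and the parameter set $P$, and instead keeps the best chromosome and fills the next generation with random mutations of the better half. The names `run_ga`, `pop_size` and `seed` are hypothetical, and no claim is made about how often this simplified version succeeds.

```python
import random

def commutes(g, h):
    return abs(abs(g) - abs(h)) >= 2          # Vershik relation |i - j| >= 2

def reduce_word(word):
    out = []
    for g in word:
        for k in range(len(out) - 1, -1, -1):
            if out[k] == -g:
                del out[k]
                break
            if not commutes(out[k], g):
                out.append(g)
                break
        else:
            out.append(g)
    return out

def cost(chi, zeta, a, b):
    return len(reduce_word(chi + a + zeta + [-g for g in reversed(b)]))

def run_ga(a, b, Y, Z, pop_size=40, sigma=300, seed=0):
    rng = random.Random(seed)

    def letter(S):
        return rng.choice(S) * rng.choice((1, -1))

    def mutate(word, S):
        w = list(word)
        op = rng.choice(("insert", "delete", "substitute"))
        if op == "insert" or not w:
            w.insert(rng.randint(0, len(w)), letter(S))
        elif op == "delete":
            del w[rng.randrange(len(w))]
        else:
            w[rng.randrange(len(w))] = letter(S)
        return reduce_word(w)                  # keep chromosomes pseudo-reduced

    population = [([letter(Y)], [letter(Z)]) for _ in range(pop_size)]  # initial length one
    for i in range(sigma):                     # sigma plays the role of the suicide control
        population.sort(key=lambda c: cost(c[0], c[1], a, b))
        best = population[0]
        if cost(best[0], best[1], a, b) == 0:
            return best, i                     # solution and generation count
        parents = population[: max(1, pop_size // 2)]
        children = [best]                      # continuance of the fittest chromosome
        while len(children) < pop_size:
            chi, zeta = rng.choice(parents)
            if rng.random() < 0.5:
                chi = mutate(chi, Y)
            else:
                zeta = mutate(zeta, Z)
            children.append((chi, zeta))
        population = children
    return None, sigma                         # suicide: no solution found

Y, Z = [1, 2, 3, 4], [7, 8, 9, 10]             # n = 10, m = 5
a = [2, 5, -6, 3]
b = reduce_word([1, -3] + a + [8, 10])         # hidden solution x = x_1 x_3^{-1}, y = x_8 x_10
result, generations = run_ga(a, b, Y, Z)
```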

5.3 Method of Testing 

We wished to test the performance of the GA on "randomly generated" instances
of problem (P). Define the length of an instance of (P) to be the set of lengths
$\{l(a), l(x), l(y)\}$ of words $a, x, y \in V_n$ used to create that instance. Each of the
words $a$, $x$ and $y$ is generated by simple random walk on $V_n$. To generate a word
$u$ of given length $k = l(u)$, first generate the unreduced word $u_1$ with unreduced
length $k$. Then if $l(u_1) < k$, generate $u_2$ of unreduced length $k - l(u_1)$,
take $u_1 u_2$ and repeat this procedure until we produce a word $u = u_1 u_2 \cdots u_r$
with $l(u)$ equal to the required length $k$.
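
The instance generator can be sketched in Python as follows. It is an illustration under assumptions: words are lists of signed integers, the word is kept in reduced form between rounds (the text concatenates unreduced pieces), and the subsets $Y = \{x_1, \ldots, x_{m-1}\}$ and $Z = \{x_{m+2}, \ldots, x_n\}$ are taken from Section 5.1. The names `random_word` and `random_instance` are hypothetical.

```python
import random

def commutes(g, h):
    return abs(abs(g) - abs(h)) >= 2          # Vershik relation |i - j| >= 2

def reduce_word(word):
    out = []
    for g in word:
        for k in range(len(out) - 1, -1, -1):
            if out[k] == -g:
                del out[k]
                break
            if not commutes(out[k], g):
                out.append(g)
                break
        else:
            out.append(g)
    return out

def random_word(k, alphabet, rng):
    """Random walk that tops up the shortfall until the reduced length is exactly k."""
    u = []
    while len(u) < k:
        extra = [rng.choice(alphabet) * rng.choice((1, -1)) for _ in range(k - len(u))]
        u = reduce_word(u + extra)
    return u

def random_instance(n, len_a, len_x, len_y, rng):
    """Build (a, b) with b = x a y, and return the hidden solution as well."""
    m = n // 2
    Y = list(range(1, m))                      # indices 1 .. m-1
    Z = list(range(m + 2, n + 1))              # indices m+2 .. n
    a = random_word(len_a, list(range(1, n + 1)), rng)
    x = random_word(len_x, Y, rng)
    y = random_word(len_y, Z, rng)
    b = reduce_word(x + a + y)
    return a, b, x, y

rng = random.Random(1)
a, b, x, y = random_instance(10, 32, 4, 4, rng)
```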

We identified two key input data for the GA: the length of an instance of (P) 
and the group rank, n. Two types of tests were performed, varying these data: 

1. Test of the GA with long instances while keeping the rank small; 

2. Test of the GA with instances of moderate length while increasing the rank. 

The algorithms and tests were developed and conducted in GNU C++ on a 
Pentium IV 2.53 GHz computer with 1GB of RAM running Debian Linux 3.0. 

5.4 Results 

Define the generation count to be the number of populations (and so iterations)
required to solve a given instance; see the counter $i$ in Algorithm 2. We present
the results of the tests and follow this in Section 5.5 with discussion of the results.

Increasing Length We tested the GA on eight randomly generated instances
(I1)-(I8) with the rank of $V_n$ set at $n = 10$. The instances (I1)-(I8) were generated
beginning with $l(a) = 128$ and $l(x) = l(y) = 16$ for instance (I1) and
progressing to the following instance by doubling the length $l(a)$ or both of the
lengths $l(x)$ and $l(y)$. The GA was run ten times on each instance and the mean
runtime $\bar{t}$ in seconds and mean generation count $\bar{g}$ across all runs of that instance
were taken. For each collection of runs of an instance we took the standard deviation
$\sigma_g$ of the generation counts and the mean time in seconds taken to compute
each population. A summary of results is given in Table 1.

Table 1. Results of increasing instance lengths for constant rank n = 10.

Instance   l(a)   l(x)   l(y)   mean g   mean t (s)   sigma_g   sec/gen
I1          128     16     16      183           59      68.3     0.323
I2          128     32     32      313          105     198.5     0.339
I3          256     64     64      780          380     325.5     0.515
I4          512     64     64      623          376     205.8     0.607
I5          512    128    128      731          562      84.4     0.769
I6         1024    128    128     1342          801     307.1     0.598
I7         1024    256    256     5947         5921    1525.3     1.004
I8         2048    512    512    14805        58444    3576.4     3.849



Increasing Rank These tests were designed to keep the lengths of computed
words relatively small while allowing the rank $n$ to increase. We no longer impose
the condition of $l(x) = l(y)$. Take $s$ to be the arithmetic mean of the lengths of
$x$ and $y$. Instances were constructed by taking $n = 10$, $20$ or $40$ and generating
random $a$ of maximal length 750, random $x$ and $y$ of maximal length 150 and
then reducing the new $b = xay$ to its normal form $\bar{b}$.

We then ran the GA once on each of 505 randomly generated instances for
$n = 10$, with 145 instances for $n = 20$ and 52 instances for $n = 40$. We took the
time $t$ in seconds to produce a solution and the respective generation count $g$.
The data collected is summarised in Table 2 by grouping the length $s$ of instance
into intervals of length fifteen. For example, the range 75-90 means all instances
where $s \in [75, 90)$. Across each interval we computed the means $\bar{g}$ and $\bar{t}$ along
with the standard deviation $\sigma_g$. We now give a brief discussion of the results and
some conjectures, and then conclude our work.

5.5 Discussion and Conclusion 

Firstly, the mean times given in Tables 1 and 2 depend upon the time complexity
of the underlying algebraic operations. We conjecture for $n = 10$ that these have
time complexity no greater than $O(k \log k)$ where $k$ is the mean length of all
words across the entire run of the GA that we wish to reduce.

Table 1 shows we have a good method for solving large scale problems when
the rank is $n = 10$. By Table 2 we observe the GA operates very well in most
cases across problems where the mean length of $x$ and $y$ is less than 150 and rank
at most forty. Fixing $s$ in a given range, the mean generation count increases at
an approximately linearithmic rate as $n$ increases. This seems to hold for all $n$
up to forty, so we conjecture that for a mean instance of problem (P) with given
rank $n$ and instance length $s$ the generation count for an average run of the GA
lies between $O(sn)$ and $O(sn \log n)$. This conjecture means the GA generation
count depends linearly on $s$ (for brevity, we omit the statistical evidence here).

Table 2. Results of increasing rank from n = 10 (upper rows) to n = 20 (centre
rows) and n = 40 (lower rows).

s                15-30  30-45  45-60  60-75  75-90  90-105  105-120  120-135  135-150
n = 10  mean g     227    467    619    965   1120    1740     1673     2057     2412
n = 10  mean t      44     94    123    207    244     384      399      525      652
n = 20  mean g     646   2391   2593   4349   4351    8585     8178     8103    10351
n = 20  mean t     251    897    876   1943   1737    3339     3265     4104     4337
n = 40  mean g    1341   1496   2252   1721   6832   14333    14363        -        -
n = 40  mean t     949   1053    836   1142   5727   10037    11031        -        -

As $n$ increases across the full range of instances of (P), increasing numbers of
suicides tend to occur as the GA encounters increasing numbers of local minima.
These may be partially explained by observing traceback. For $n$ large, we are
likely to have many more blocks than for $n$ small (as the likelihood of two
arbitrary generators commuting is larger). While traceback is much more efficient
than a purely random method, there are more chances to read $\delta$ between blocks.
Indeed, there may be so many possible $\delta$ that it takes many GA iterations to
reduce cost. By experience of this situation, non-asexual methods of reproduction
bring the GA out of some local minima. Consider the following typical GA
output, where the best chromosomes from populations 44 and 64 (before and
after a local minimum) are:

Gen 44 (c = 302): x = 9 6 5 6 7 4 5 -6 7 5 -3 -3 (l = 12)
                  y = -20 14 12 14 -20 -20 (l = 6)

Gen 64 (c = 300): x = 9 8 1 7 6 5 6 7 4 5 -6 7 9 5 -3 -3 (l = 16)
                  y = 14 12 12 -20 14 15 -14 -14 -16 17 15 14 -20 15 -19 -20 -20 -19
                      -20 18 -17 -16 (l = 22)

In this case, cost reduction is not made by a small change in chromosome length,
but by a large one. We observe that the cost reduction is made when a chromosome
from lower in the ordered population is selected and then mutated, as
the new chromosome at population 64 is far longer. In this case it seems traceback
acts as a topological sorting method on the generators of the equation $E$,
giving complex systems of cancellation in $E$ which result in a cost reduction
greater than one. This suggests that fine-tuning the parameter set to focus more
on reproduction lower in the population and reproduction which causes larger
changes in word length may improve performance. Indeed, [3] conjectures that

"It seems plausible to conjecture that sexual mating has the purpose to
overcome situations where asexual evolution is stagnant."

Bremermann [3, p. 102]

This suggests the GA performs well in comparison to asexual hillclimbing methods.
Indeed, this is the case in practice: by making appropriate parameter choices
we may simulate such a hillclimb, which experimentally encounters many more
local minima. These local minima seem to require substantial changes in the
form of $\chi$ and $\zeta$ (as above); this cannot be done by mere asexual reproduction.

Meanwhile, coupled with a concept of "growing" solutions, we have at least
for reasonable values of $n$ an indication of a good underlying deterministic algorithm
based on traceback. Indeed, such deterministic algorithms were developed
in [2] as the result of analysis of experimental data in our work. This hints that
the search space has a "good" structure and may be exploited by appropriately
sensitive GAs and other artificial intelligence technologies in our framework.